Medical Decision Making
SAGE Publications
All preprints, ranked by how well they match Medical Decision Making's content profile, based on 10 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Boettcher, L.; Felder, S.
The use of multiple tests can improve medical decision making. The patient utility maximizing combination of these tests involves balancing the benefits of correctly treating ill patients and avoiding unnecessary treatment for healthy individuals against the potential harms of missed diagnoses or inappropriate treatments. We quantify the incremental net benefit (INB) of single and multiple tests by accounting for a patient's pre-test probability of disease and the associated benefits and harms of treatment. We decompose the INB into two components: one that captures the value of information provided by the test, independent of the cost and possible harm of testing, and another that accounts for test costs and harm. We examine conjunctive, disjunctive, and majority aggregation functions, demonstrating their application through examples in prostate cancer, colorectal cancer, and stable coronary artery disease diagnostics. Using empirical test and cost data, we identify decision boundaries to determine when conjunctive, disjunctive, majority, or even single tests are optimal, based on a patient's pre-test probability of disease and the cost-benefit tradeoff of treatment. In all three cases, we find that the optimal choice of combined tests depends on both the cost-benefit tradeoff of treatment and the probability of disease. An online tool that visualizes the INB for combined tests is available at https://optimal-testing.streamlit.app/.
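The test-combination logic described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's model: it assumes two tests that are conditionally independent given disease status, a benefit B of treating the ill, a harm H of treating the healthy, and a per-combination testing cost, all with invented numbers.

```python
def combo_accuracy(sens, spec, rule):
    """Sensitivity and specificity of a two-test combination.

    rule='conjunctive': treat only if BOTH tests are positive.
    rule='disjunctive': treat if EITHER test is positive.
    Assumes test results are independent given disease status.
    """
    s1, s2 = sens
    p1, p2 = spec
    if rule == "conjunctive":
        return s1 * s2, 1 - (1 - p1) * (1 - p2)
    if rule == "disjunctive":
        return 1 - (1 - s1) * (1 - s2), p1 * p2
    raise ValueError(rule)

def net_benefit(pretest, sens, spec, B, H, cost):
    """Expected net benefit of treat-if-positive, minus testing cost."""
    return pretest * sens * B - (1 - pretest) * (1 - spec) * H - cost

# Illustrative comparison at a 20% pre-test probability of disease.
p = 0.20
tests = ((0.85, 0.75), (0.90, 0.80))  # (sensitivity, specificity) per test
for rule in ("conjunctive", "disjunctive"):
    s, sp = combo_accuracy([t[0] for t in tests], [t[1] for t in tests], rule)
    print(rule, round(net_benefit(p, s, sp, B=100, H=40, cost=2), 2))
```

With these made-up inputs the conjunctive rule wins at this pre-test probability because false-positive harm dominates; the paper's decision boundaries trace how that ranking flips as the probability and cost-benefit tradeoff vary.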
Srivastava, T.; Strong, M.; Stevenson, M. D.; Dodd, P. J.
Introduction: Discrete-time Markov models are widely used within health economic modelling. Analyses usually associate costs and health outcomes with health states and calculate totals for each decision option over some timeframe. Frequently, a correction method (e.g., half-cycle correction) is applied to unadjusted model outputs to yield an approximation to an assumed underlying continuous-time Markov model. In this study, we introduce a novel approximation method based on Gaussian Quadrature (GQ). Methods: We exploited analytical results for time-homogeneous Markov chains to derive a new GQ-based approximation, which is applied to an unadjusted discrete-time model output. The GQ method approximates a continuous-time Markov model result by approximating a correction matrix, formulated as an integral, using a weighted sum of integrand values at specified points. GQ approximations can be made arbitrarily accurate by increasing the order of the approximation. We compared the first five orders of GQ approximation with four existing cycle correction methods (half-cycle correction, the trapezoidal rule, and Simpson's 1/3 and 3/8 rules) across 100,000 randomly generated input parameter sets. Results: We show that the first-order GQ method is identical to the half-cycle correction method, which is itself equivalent to the trapezoidal rule. The second-order GQ method is identical to Simpson's 1/3 method. The third-, fourth-, and fifth-order GQ methods are novel in this context and provide increasingly accurate approximations to the output of the continuous-time model. In our simulation study, the fifth-order GQ method outperformed the other existing methods in over 99.8% of simulations. Of the existing methods, Simpson's 1/3 rule performed the best. Conclusion: Our novel GQ-based approximation outperforms other cycle correction methods for time-homogeneous models. The method is easy to implement, and R code and an Excel workbook are provided as supplementary materials.
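The two existing correction methods that the paper shows to coincide with the first two GQ orders are easy to state concretely. The sketch below (not the paper's code; the two-state model and utilities are invented) applies the trapezoidal rule, which is the half-cycle correction, and Simpson's 1/3 rule to per-cycle rewards from a discrete Markov trace.

```python
import numpy as np

def markov_trace(P, x0, n_cycles):
    """State-occupancy vectors at cycles 0..n_cycles for transition matrix P."""
    trace = [np.asarray(x0, dtype=float)]
    for _ in range(n_cycles):
        trace.append(trace[-1] @ P)
    return np.vstack(trace)

def half_cycle_total(per_cycle_values):
    """Trapezoidal rule: half weight on the first and last cycle."""
    v = np.asarray(per_cycle_values, dtype=float)
    return v.sum() - 0.5 * (v[0] + v[-1])

def simpson13_total(per_cycle_values):
    """Simpson's 1/3 rule; needs an even number of cycles (odd # of points)."""
    v = np.asarray(per_cycle_values, dtype=float)
    n = len(v) - 1
    if n % 2:
        raise ValueError("Simpson's 1/3 needs an even number of cycles")
    w = np.ones(n + 1)
    w[1:-1:2] = 4.0   # odd interior points
    w[2:-1:2] = 2.0   # even interior points
    return (w * v).sum() / 3.0

# Two-state alive/dead example with an illustrative utility of 0.8 alive.
P = np.array([[0.9, 0.1],
              [0.0, 1.0]])
trace = markov_trace(P, [1.0, 0.0], n_cycles=10)
qalys_per_cycle = trace @ np.array([0.8, 0.0])
print(half_cycle_total(qalys_per_cycle), simpson13_total(qalys_per_cycle))
```

The paper's third- and higher-order GQ corrections generalize these weighted sums; the weights above are the familiar Newton-Cotes special cases.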
Bollee, M.; Dutta Majumdar, A.
Discrete-time Markov cohort state-transition models are now well-established as the preferred choice of analysts across application areas including health technology assessment. This preference arises from their relative intuitiveness and their capacity to strike a balance between complex disease pathways, statistical precision, and parsimony, although they have been criticized by a wide variety of stakeholders. Transition probability matrices (TPMs) are the "heart and soul" of such models, responsible for estimating patient dispositions. However, estimating such TPMs comes with its own set of challenges. In some situations, the transition data may be censored, such that the health state of a patient is unknown for multiple time steps before the next observation, or the data may be immature, especially in rare diseases. Craig and Sendi proposed the expectation-maximization (EM) algorithm using uniform weights as a solution for unequal estimation intervals in partially observed data. However, this typically comes at the cost of increased within-state output variation, with no optimization technique available in the literature. The objective of this paper is to explore an optimized weighted version of the original EM algorithm that aims to estimate the set of weights which minimizes the uncertainty of the estimated TPM against a target objective function. The weighting reduces the uncertainty of the estimate by considering the difference in temporal sparsity of the data when there are missing time steps. Further, we demonstrate the applicability of this weighting method using a fictitious cost-effectiveness model, showing a small but definitive improvement over the original approach.
Pi, S.; Rutter, C.; Pineda-Antunez, C.; Chen, J. H.; Goldhaber-Fiebert, J. D.; Alarid-Escudero, F.
Simulation models inform health policy decisions by integrating data from multiple sources and forecasting outcomes when there is a lack of comprehensive evidence from empirical studies. Such models have long supported health policy for cancer, the first or second leading cause of death in over 100 countries. Discrete-event simulation (DES) and Bayesian calibration have gained traction in the field of Decision Science because they enable flexible modeling of complex health conditions and produce estimates of model parameters that reflect real-world disease epidemiology and data uncertainty given model constraints. This uncertainty is then propagated to model-generated outputs, enabling decision makers to assess confidence in recommendations and estimate the value of collecting additional information. However, there is limited end-to-end guidance on structuring a DES model for cancer progression, estimating its parameters using Bayesian calibration, and applying the calibration outputs to policy evaluation. To fill this gap, we introduce the DES Modeling Framework for Cancer Interventions and Population Health in R (DESCIPHR), an open-source codebase integrating a flexible DES model for the natural history of cancer, Bayesian calibration for parameter estimation, and an example application of screening strategy evaluation. To illustrate the framework, we apply DESCIPHR to calibrate bladder and colorectal cancer models to real-world cancer registry targets. We also introduce an automated method for generating data-informed parameter prior distributions and increase the functionality of a neural network emulator-based Bayesian calibration algorithm. We anticipate that the adaptable DESCIPHR modeling template will facilitate the construction of future decision models evaluating the risks and benefits of health interventions. 
Key points for decision makers
- For simulation models to be useful for decision-making, they should accurately reproduce real-world outcomes and their uncertainty.
- The DESCIPHR framework and code repository address a gap in open-source resources to fit an individual-level model for cancer progression to real-world data and forecast the impact of cancer screening interventions while accounting for data uncertainty.
- The codebase is designed to be highly adaptable for researchers who wish to apply DESCIPHR for economic evaluation or for studying methodological questions.
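The core mechanic of a discrete-event natural-history model like the one DESCIPHR wraps can be sketched very compactly. This toy (not the DESCIPHR code; all rates and the detection rule are invented) samples event times per person and checks whether a single screen falls inside the preclinical window.

```python
import random

def simulate_person(rng, screen_age=None):
    """One person's cancer natural history as sampled event times."""
    onset = rng.expovariate(1 / 60)    # age at preclinical onset (mean 60)
    sojourn = rng.expovariate(1 / 4)   # preclinical dwell time (mean 4 yrs)
    clinical = onset + sojourn         # age at clinical diagnosis
    if screen_age is not None and onset <= screen_age < clinical:
        return "screen_detected"       # screen caught preclinical disease
    return "clinical" if clinical < 85 else "never_diagnosed"

rng = random.Random(1)
outcomes = [simulate_person(rng, screen_age=55) for _ in range(10000)]
print({k: outcomes.count(k) for k in set(outcomes)})
```

Bayesian calibration, as described above, would then adjust the onset and sojourn parameters until model outputs match registry targets, propagating the posterior uncertainty into screening projections.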
Zhao, X.; Gopalappa, C.
Background: Women with HIV face elevated cervical cancer risk, compounded by social conditions that influence the outcomes of both diseases. Current models fail to adequately capture the complex interactions between diseases and social determinants. Methods: We enhanced a mixed agent and compartment model for HIV and cervical cancer (MAC-HIV-CC) to model disparities by social conditions. We analyzed the impact of hypothetical 100% efficacious interventions over 30 years (2018-2048): (1) an HIV care intervention that eliminates disparities in viral load suppression between social groups, (2) a sexual behavior intervention aligning behaviors of women who exchange sex with those who do not, and (3) a combination of both interventions. Results: The HIV care intervention reduced HIV incidence by 26.9% and cervical cancer cases by 14.5% among HIV-positive women. The sexual behavior intervention decreased HIV prevalence by 8.1% and HPV prevalence by 36.1% among HIV-positive women engaged in exchange sex. The combination intervention reduced HIV prevalence by 25.3%, HIV incidence by 34.3%, and cervical cancer cases by 37.5% in the target population. Conclusions: The proposed framework provides a novel approach for health equity analyses by modeling social determinants that are common pathways to interrelated diseases and health disparities. Such a model is of significance for cost-effectiveness analyses of interventions for interrelated diseases.
Silver, J.
Testing people without symptoms for SARS-CoV-2 followed by isolation of those who test positive could mitigate the COVID-19 epidemic pending arrival of an effective vaccine. Key questions for such programs are who should be tested, how often, and when such testing should stop. Answers to these questions depend on test and population characteristics. A cost-effectiveness model that provides answers depending on user-adjustable parameter values is described. Key parameters are the value ascribed to preventing a death and the reproduction number (roughly, rate of spread) at the time surveillance testing is initiated. For current rates of spread, cost-effectiveness usually requires a value per life saved greater than $100,000 and depends critically on the extent and frequency of testing.
Hegarty, S. E.; Linn, K. A.; Zhang, H.; Teeple, S.; Albert, P. S.; Parikh, R. B.; Courtright, K.; Kent, D. M.; Chen, J.
The proliferation of algorithm-assisted decision making has prompted calls for careful assessment of algorithm fairness. One popular fairness metric, equal opportunity, demands parity in true positive rates (TPRs) across different population subgroups. However, we highlight a critical but overlooked weakness in this measure: at a given decision threshold, TPRs vary when the underlying risk distribution varies across subgroups, even if the model equally captures the underlying risks. Failure to account for variations in risk distributions may lead to misleading conclusions on performance disparity. To address this issue, we introduce a novel metric called adjusted TPR (aTPR), which modifies subgroup-specific TPRs to reflect performance relative to the risk distribution in a common reference subgroup. Evaluating fairness using aTPRs promotes equal treatment for equal risk by reflecting whether individuals with similar underlying risks have similar opportunities of being identified as high risk by the model, regardless of subgroup membership. We demonstrate our method through numerical experiments that explore a range of differential calibration relationships and in a real-world data set that predicts 6-month mortality risk in an in-patient sample in order to increase timely referrals for palliative care consultations.
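The intuition behind aTPR can be illustrated with a simple reweighting scheme. This is a sketch of the idea, not the authors' estimator: a subgroup's TPR among cases is recomputed after reweighting its cases so that their predicted-risk histogram matches a reference subgroup's, using equal-width risk bins.

```python
import numpy as np

def tpr(scores, labels, threshold):
    """True positive rate at a fixed decision threshold."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    return float((scores[labels == 1] >= threshold).mean())

def adjusted_tpr(scores, labels, ref_scores, ref_labels, threshold, bins=10):
    """Subgroup TPR, reweighted to the reference cases' risk histogram."""
    edges = np.linspace(0, 1, bins + 1)
    s = np.asarray(scores)[np.asarray(labels) == 1]
    r = np.asarray(ref_scores)[np.asarray(ref_labels) == 1]
    s_bin = np.clip(np.digitize(s, edges) - 1, 0, bins - 1)
    r_bin = np.clip(np.digitize(r, edges) - 1, 0, bins - 1)
    ref_w = np.bincount(r_bin, minlength=bins) / len(r)
    sub_w = np.bincount(s_bin, minlength=bins) / len(s)
    # Weight each case by the ratio of reference to subgroup bin mass.
    w = np.where(sub_w[s_bin] > 0, ref_w[s_bin] / sub_w[s_bin], 0.0)
    return float(((s >= threshold) * w).sum() / w.sum())

# Synthetic groups whose cases concentrate at different risk levels.
rng = np.random.default_rng(0)
a_scores = rng.beta(2, 5, 2000); a_labels = rng.binomial(1, a_scores)
b_scores = rng.beta(5, 2, 2000); b_labels = rng.binomial(1, b_scores)
print(tpr(a_scores, a_labels, 0.5), tpr(b_scores, b_labels, 0.5),
      adjusted_tpr(b_scores, b_labels, a_scores, a_labels, 0.5))
```

In this synthetic setup the raw TPRs differ mainly because group B's cases sit at higher risks; the reweighted value moves toward group A's, illustrating how aTPR separates calibration differences from risk-distribution differences.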
Nascimento de Lima, P.; Rutter, C.; Maerzluft, C.; Ozik, J.; Collier, N.
Colorectal Cancer (CRC) is a leading cause of cancer deaths in the United States. Despite significant overall declines in CRC incidence and mortality, there has been an alarming increase in CRC among people younger than 50. This study uses an established microsimulation model, CRC-SPIN, to perform a stress test of colonoscopy screening strategies. First, we expand CRC-SPIN to include birth-cohort effects. Second, we estimate natural history model parameters via Incremental Mixture Approximate Bayesian Computation (IMABC) for two model versions to characterize uncertainty while accounting for increased early CRC onset. Third, we simulate 26 colonoscopy screening strategies across the posterior distribution of estimated model parameters, assuming four different colonoscopy sensitivities (104 total scenarios). We find that model projections of screening benefit are highly dependent on natural history and test sensitivity assumptions, but in this stress test, the policy recommendations are robust to the uncertainties considered.
Gracia, V.; Goldhaber-Fiebert, J. D.; Alarid-Escudero, F.
Purpose: We introduce PRE-CISE, a pre-calibration workflow that integrates coverage analysis, local sensitivity, and collinearity diagnostics to streamline model calibration and transparently address nonidentifiability. We demonstrate the benefits of PRE-CISE using a four-state Sick-Sicker Markov testbed and a COVID-19 case study. Methods: PRE-CISE begins with a coverage analysis to verify that model outputs generated with parameter sets drawn from their prior distribution span the calibration targets, followed by local sensitivities to quantify the influence of parameters on model outputs, guiding the resizing of the prior distribution bounds to improve coverage. Identifiability is then assessed via collinearity analysis; large indices indicate practical nonidentifiability. For the testbed model, we calibrated 3 parameters to survival, prevalence, and the proportion of Sick to Sicker at 10, 20, and 30 years. For the COVID-19 model, we calibrated 11 parameters to match daily confirmed incident cases. Bayesian calibration was conducted in both analyses. Results: Coverage analyses flagged initial misfits; local sensitivities identified that the Sick-to-Sicker transition probability had the greatest effect on model outputs, and resizing its prior distribution bounds improved coverage. Collinearity analyses showed that combining multiple calibration targets across time points enabled recovery of all three parameters. In the COVID-19 model, local sensitivity analyses prioritized time-varying detection rates and contact-reduction effects, reducing the search space and thereby improving calibration efficiency. Daily incident case calibration targets yielded collinearity indices below practical thresholds (e.g., < 15) for all parameter combinations, whereas indices for weekly calibration targets were larger and closer to the cutoff.
Conclusions: PRE-CISE provides a practical, transparent pathway that helps modelers refine prior distribution bounds and calibration targets before intensive calibration, improving uncertainty reporting and strengthening the reliability of model-based health policy analyses.
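The collinearity diagnostic this workflow relies on is typically computed from the local sensitivity matrix. The sketch below assumes the Brun-style index (an assumption about the paper's exact formula): sensitivity columns are scaled to unit length, and the index is the reciprocal square root of the smallest eigenvalue of the resulting Gram matrix, with large values signalling practical nonidentifiability.

```python
import numpy as np

def collinearity_index(sens_matrix):
    """Brun-style collinearity index for a set of parameter sensitivities.

    Rows = model outputs, columns = parameters. Columns are normalized to
    unit length; index = 1 / sqrt(min eigenvalue of S~^T S~).
    """
    S = np.asarray(sens_matrix, dtype=float)
    S = S / np.linalg.norm(S, axis=0)          # unit-length columns
    eigvals = np.linalg.eigvalsh(S.T @ S)      # ascending order
    return float(1.0 / np.sqrt(eigvals[0]))

# Two near-collinear parameters vs. two orthogonal ones (illustrative).
print(collinearity_index([[1.0, 0.99], [0.5, 0.51]]))  # large index
print(collinearity_index([[1.0, 0.0], [0.0, 1.0]]))    # exactly 1
```

Adding calibration targets adds rows to the sensitivity matrix, which is why combining targets across time points, as reported above, can pull the index below the practical threshold.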
Wong, C. H.; Li, D.; Wang, N.; Gruber, J.; Conti, R.; Lo, A. W.
We assess the potential financial impact of future gene therapies by identifying the 109 late-stage gene therapy clinical trials currently underway, estimating the prevalence and incidence of their corresponding diseases, developing novel mathematical models of the increase in quality-adjusted life years for each approved gene therapy, and simulating the launch prices and the expected spending of these therapies over a 15-year time horizon. The results of our simulation suggest that an expected total of 1.09 million patients will be treated by gene therapy from January 2020 to December 2034. The expected peak annual spending on these therapies is $25.3 billion, and the total spending from January 2020 to December 2034 is $306 billion. We decompose their annual estimated spending by treated age group as a proxy for U.S. insurance type, and consider the tradeoffs of various methods of payment for these therapies to ensure patient access to their expected benefits.
McLaren, Z. M.
The data-driven targeting of COVID-19 vaccination programs is a major determinant of the ongoing toll of COVID-19. Targeting of access to, outreach about and incentives for vaccination can reduce total deaths by 20-50 percent relative to a first-come-first-served allocation. This piece performs a systematic review of the modeling literature on the relative benefits of targeting different groups for vaccination and evaluates the broader scholarly evidence - including analyses of real-world challenges around implementation, equity, and other ethical considerations - to guide vaccination targeting strategies. Three-quarters of the modeling studies reviewed concluded that the most effective way to save lives, reduce hospitalizations and mitigate the ongoing toll of COVID-19 is to target vaccination program resources to high-risk people directly rather than reducing transmission by targeting low-risk people. There is compelling evidence that defining vulnerability based on a combination of age, occupation, underlying medical conditions and geographic location is more effective than targeting based on age alone. Incorporating measures of economic vulnerability into the prioritization scheme not only reduces mortality but also improves equity. The data-driven targeting of COVID-19 vaccination program resources benefits everyone by efficiently mitigating the worst effects of the pandemic until the threat of COVID-19 has passed.
Blissett, R. S.; Sullivan, W.; Subban, I.; Igloi-Nagy, A.
Cohort-level models in Microsoft Excel(R) remain the standard for cost-effectiveness modelling to inform health technology assessment (HTA), despite calls and rationale for more flexible approaches. Their limited ability to capture patient-level characteristics can, in the presence of patient heterogeneity or the need to track patient characteristics to accurately capture a technology's implications, introduce bias. Their continued prevalence is explained by key stakeholders' familiarity with spreadsheet software and the lower computational burden of cohort-level versus patient-level models. However, contemporary Excel functions have opened up possibilities for efficient calculations within native Excel that enable more flexible, patient-level approaches to be implemented in familiar spreadsheet-based software. Therefore, this tutorial aims to provide step-by-step guidance on how to implement a previously published and freely available individual-level discrete event simulation (DES) in Excel, using contemporary Excel functions and without any Visual Basic for Applications (VBA) code.
Key Points for Decision-Makers
- Perceived and real requirements for cost-effectiveness models for HTA to be built in Excel may have led to overuse of cohort-level approaches, with probable bias implications for HTA decision-making.
- Contemporary Excel functions now allow the efficient implementation and execution of patient-level model calculations within native Excel, without any VBA code. Such capabilities may reduce technical barriers across key stakeholders, enhance transparency, and ultimately lead to improvements in HTA decision-making.
- This tutorial provides step-by-step guidance on how to implement an efficient patient-level cost-effectiveness model in Excel without any VBA, with an executable model example included as supplementary material.
Chen, Z.; Marrero, W. J.
The unintended biases introduced by optimization and machine learning (ML) models are a topic of great interest to medical researchers and professionals. Bias in healthcare decisions can cause patients from vulnerable populations (e.g., racially minoritized, low-income, or living in rural areas) to have lower access to resources and inferior outcomes, thus exacerbating societal unfairness. In this systematic literature review, we present a structured overview of the literature regarding fair decision making in healthcare until April 2024. After screening 801 unique references, we identified 114 articles within the scope of our review. In our review, we comprehensively examine fair decision-making methodologies in healthcare by systematically identifying and categorizing biases within both data and models. Initially, we elucidate existing bias within healthcare decision making. Then, we present a range of fairness metrics drawn from different use cases, followed by analyzing and classifying bias mitigation strategies into pre-processing, in-processing, and post-processing techniques. We provide a broad conceptual overview and practical illustrations of each approach. Additionally, we examine emerging bias mitigation technologies that, though not yet applied in healthcare, show substantial promise for future integration. Our review aims to increase awareness of fairness in healthcare decision making and facilitate the selection of appropriate approaches under varying scenarios.
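One of the post-processing strategies in the taxonomy this review describes can be made concrete in a few lines. This toy sketch (not drawn from any reviewed paper; data and target are invented) picks a group-specific decision threshold so that each group's true positive rate reaches a target, leaving the trained model itself untouched, which is the hallmark of post-processing mitigation.

```python
import numpy as np

def threshold_for_tpr(scores, labels, target_tpr):
    """Smallest threshold whose TPR among positives meets target_tpr.

    Post-processing: the risk model is unchanged; only the cutoff used to
    flag a patient as high-risk differs by group.
    """
    pos = np.sort(np.asarray(scores)[np.asarray(labels) == 1])[::-1]
    k = int(np.ceil(target_tpr * len(pos)))
    return float(pos[k - 1]) if k > 0 else float("inf")

# Equalize opportunity at TPR = 0.5 across two illustrative groups.
g1 = threshold_for_tpr([0.9, 0.7, 0.4, 0.1], [1, 1, 1, 1], 0.5)
g2 = threshold_for_tpr([0.6, 0.5, 0.3, 0.2], [1, 1, 0, 1], 0.5)
print(g1, g2)  # different cutoffs, same true positive rate per group
```

Pre-processing methods would instead transform the training data, and in-processing methods would add a fairness term to the training objective; the review classifies mitigation techniques along exactly these lines.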
Schultz, A.; Chang, A. B.; McCallum, G. B.; Lau, G.; Toombs, M.; Barwick, M.; Laird, P.; Morris, P. S.; Norman, R.; Aitken, R.; Walker, R.; Mascaro, S.
Despite the potential of evidence-based medical innovations to improve patient outcomes, their integration remains difficult. Implementation science aims to assist by identifying and deploying effective implementation strategies within complex health care settings. Determinant frameworks, such as the Consolidated Framework for Implementation Research (CFIR), help identify factors influencing implementation success but do not specify mechanisms or methods for selecting optimal strategies. Selection methods are largely empirical, highlighting the need for objective, quantifiable approaches. We developed causal Bayesian networks (BNs) to model the interdependencies amongst contextual factors, determinants and outcomes with a specific example: the detection and management of chronic wet cough in Indigenous Australian children in primary health care settings. The BNs, informed by CFIR domains and prior qualitative research, quantify the impact of barriers and enablers on implementation outcomes. The BNs enable predictions of intervention effects, and the assessment and quantification of potential implementation strategies, or combinations of strategies. The BNs are linked to a simple survey that allows implementation strategies to be tailored for each setting and that was administered at several sites across Australia to validate the models. The overall process, including the BNs and surveys, constitutes a generalisable structured workflow for selecting the most promising strategies. We describe the model development and validation, and the broader applicability of our BN-based workflow in implementation science.
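The kind of "what does this strategy buy us" query a causal BN answers can be shown with a deliberately tiny example. This is not the authors' networks: it is a two-node sketch with invented probabilities, where an implementation strategy raises the probability that clinicians are trained, which in turn raises detection of chronic wet cough.

```python
def p_detect(p_trained, p_detect_given=(0.8, 0.3)):
    """Marginal detection probability by enumeration over the parent node.

    p_detect_given = (P(detect | trained), P(detect | untrained));
    both values are illustrative assumptions.
    """
    p_yes, p_no = p_detect_given
    return p_trained * p_yes + (1 - p_trained) * p_no

baseline = p_detect(p_trained=0.2)       # no implementation strategy
with_strategy = p_detect(p_trained=0.7)  # e.g., training workshops deployed
print(baseline, with_strategy)
```

The real networks do this over many CFIR-informed nodes at once, so the same enumeration lets analysts rank candidate strategies, or combinations, by their predicted effect on the implementation outcome.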
Gilson, F.; Osstyn, S.; Handels, R.
Background: Transparency and credibility of health-economic simulation models are essential to inform reimbursement decisions. Model replication can support model transparency and credibility. Artificial intelligence (AI), particularly large language models, offers new opportunities to accelerate model replication. This led to the research question: "To what extent can the results of existing health-economic Markov models be replicated by models developed using generative AI for eliciting input parameters and code generation?" Methods: Replication was performed in three steps. First, a chain-of-thought prompting strategy in ChatGPT-4 was developed to replicate in R an open-source model co-developed by one of the authors and with publicly available code. Second, it was applied to replicate a model co-developed by one of the authors but without publicly available code. Third, it was applied to a model without the involvement of the authors and without publicly available code. A mixed-methods approach was employed, qualitatively addressing the face validity of the prompt development and refinement and quantitatively assessing deviations between AI-generated and original model predictions. Results: The first model required approximately one month to replicate, while adaptations to the second and third models took approximately two weeks each. Across the three models and 45 replications (15 per model), the average absolute relative deviations between ChatGPT-4-generated model predictions and published results were ≤14% for quality-adjusted life years and costs in the first model, ≤7% in the second model, and ≤28% in the third model. Conclusions: Our approach could support more time-efficient model replication for reimbursement decision-makers, researchers, or pharmaceutical companies. This could contribute to the transparency and credibility of health-economic models.
Lee, J. T.; Shen, T. K.-B.; Liu, V. T.-N.; Chen, T. H.-H.; Lee, C.-C.; Lu, C.; Huang, C.-W.; Atun, R.
Economic evaluations of artificial intelligence (AI) in healthcare are expanding rapidly, yet underlying costing methods remain heterogeneous and frequently incomplete for health technology assessment (HTA) and policy decision-making. In our systematic review of 55 studies published between 2010 and 2025, we found that fewer than half of the studies reported explicit costing methods; most pricing analyses failed to describe the basis of fees, subscription terms, or duration of coverage; and few analyses distinguished between average and incremental costs or accounted for economies of scale. Lifecycle expenditures--including development, validation, integration, maintenance, retraining, and decommissioning--were largely omitted, while electricity consumption, data hosting, and cloud infrastructure costs were almost never considered. Sensitivity analysis was the exception rather than the norm, and reporting of cost offsets such as reduced hospital admissions or workforce time savings was inconsistent. To address these gaps, we propose a 20-item reporting checklist to standardise the costing and pricing of AI interventions. The checklist complements existing HTA frameworks while capturing features unique to AI, such as continuous retraining, reliance on data infrastructure, and recurrent maintenance. We also introduce an AI Costing Inventory and Calculator that operationalises a lifecycle approach, enabling systematic recording of resource use, unit costs, inflation adjustments, and total and incremental costs, including offsets. These tools extend the emerging CHEERS-AI reporting framework by embedding a lifecycle perspective into costing, thereby enabling consistent estimation of resource and cost components and strengthening the methodological foundations of AI economic evaluation for policy use.
Grieco, L.; Utley, M.; Crowe, S.
The sustainable delivery of Home Health Care (HHC) is increasingly important as ageing populations and rising demands place pressure on health and social care systems. HHC involves complex decision-making across strategic, tactical, and operational levels, often constrained by limited resources such as workforce availability. While Operational Research (OR) has been widely applied to support operational decisions in HHC, recent reviews highlight a lack of focus on strategic and tactical planning, and limited recognition of the hierarchical structure linking decisions across levels. This undermines the potential of OR to inform real-world planning effectively. To address these gaps, we propose a modelling approach enabling consistent analysis of decisions across planning levels. We defined a modular approach for analysing hierarchies of decisions and accounting for cascade effects. We then applied those principles by developing a configurable tool in R, comprising a synthetic data generator and a suite of optimisation and heuristic routines. We illustrate the benefits of this approach through a case study. Our results demonstrate the value of this structured modelling approach for informing decisions in HHC. Our emphasis on modularity facilitated the development of an analysis tool that can be easily adapted to different hierarchies of decisions and settings.
Bertsimas, D.; Li, M. L.; Soni, S.
Since December 2019, the world has been ravaged by the COVID-19 pandemic, with over 150 million confirmed cases and 3 million confirmed deaths worldwide. To combat the spread of COVID-19, governments have issued unprecedented non-pharmaceutical interventions (NPIs), ranging from mass gathering restrictions to complete lockdowns. Despite their proven effectiveness in reducing virus transmission, the policies often carry significant economic and humanitarian costs, ranging from unemployment to depression, PTSD, and anxiety. In this paper, we create a data-driven system dynamics framework, THEMIS, that allows us to compare the costs and benefits of a large class of NPIs in any geographical region across different cost dimensions. As a demonstration, we analyzed thousands of alternative policies across 5 countries (United States, Germany, Brazil, Singapore, Spain) and compared them with the actual implemented policy. Our results show that moderate NPIs (such as restrictions on mass gatherings) usually produce the worst results, incurring significant cost while unable to sufficiently slow down the pandemic to prevent the virus from becoming endemic. Short but severe restrictions (complete lockdown for 4-5 weeks) generally produced the best results for developed countries, but only if the speed of reopening is slow enough to prevent a resurgence. Developing countries exhibited very different trade-off profiles from developed countries, suggesting that severe NPIs such as lockdowns might not be as suitable for developing countries in general.
Ayres, I.; Romano, A.; Sotis, C.
Due to network effects, Contact Tracing Apps (CTAs) are only effective if many people download them. However, the response to CTAs has been tepid. For example, in France fewer than 2 million people (roughly 3% of the population) downloaded the CTA. Against this background, we carry out an online experiment to show that CTAs can still play a key role in containing the spread of COVID-19, provided that they are re-conceptualized to account for insights from behavioral science. We start by showing that carefully devised in-app notifications are effective in inducing prudent behavior like wearing a mask or staying home. In particular, people who are notified that they are taking too much risk and could become a superspreader engage in more prudent behavior. Building on this result, we suggest that CTAs should be re-framed as Behavioral Feedback Apps (BFAs). The main function of BFAs would be providing users with information on how to minimize the risk of contracting COVID-19, like how crowded a store is likely to be. Moreover, the BFA could have a rating system that allows users to flag stores that do not respect safety norms like wearing masks. These functions can inform the behavior of app users, thus playing a key role in containing the spread of the virus even if a small percentage of people download the BFA. While effective contact tracing is impossible when only 3% of the population downloads the app, less risk-taking by small portions of the population can produce large benefits. BFAs can be programmed so that users can also activate a tracing function akin to the one currently carried out by CTAs. Making contact tracing an ancillary, opt-in function might facilitate a wider acceptance of BFAs.
Cusick, M. M.; Alarid-Escudero, F.; Goldhaber-Fiebert, J. D.; Rose, S.
Purpose: Health policy simulation models incorporate disease processes but often ignore social processes that influence health outcomes, potentially leading to suboptimal policy recommendations. To address this gap, we developed a novel decision-analytic modeling framework to integrate social processes. Methods: We evaluated a simplified decision problem using two models: a standard decision-analytic model and a model incorporating our social factors framework. The standard model simulated individuals transitioning through three disease natural history states (healthy, sick, and dead) without accounting for differential health system utilization. Our social factors framework incorporated heterogeneous health insurance coverage, which influenced disease progression and health system utilization. We assessed the impact of a new treatment on a hypothetical cohort of 100,000 healthy, non-Hispanic Black and non-Hispanic white 40-year-old adults. Primary outcomes included life expectancy, cumulative incidence and duration of sickness, and health system utilization throughout a person's lifetime. Secondary outcomes included costs, quality-adjusted life years, and incremental cost-effectiveness ratios. Results: In the standard model, the new treatment increased life expectancy by 2.7 years for both non-Hispanic Black and non-Hispanic white adults, without affecting racial/ethnic gaps in life expectancy. However, incorporating known racial/ethnic disparities in health insurance coverage with the social factors framework led to smaller life expectancy gains for non-Hispanic Black adults (2.0 years) compared to non-Hispanic white adults (2.2 years), increasing racial/ethnic disparities in life expectancy. Limitations: The availability of social factors data and the complexity of causal pathways between factors may pose challenges in applying our social factors framework.
Conclusions: Excluding social processes from health policy modeling can result in unrealistic projections and biased policy recommendations. Incorporating the social factors framework enhances simulation models' effectiveness in evaluating interventions with health equity implications.
Highlights
- Health policy simulation models that ignore social processes may be biased and lead to suboptimal policy recommendations. To address this, we proposed a novel social factors framework to integrate social factors into decision-analytic models for health policy.
- Applying our social factors framework to a simplified example highlighted the potential bias that results from ignoring social factors. In a standard model, a hypothetical new treatment appeared to have no effect on health disparities. However, incorporating our social factors framework demonstrated that this treatment would exacerbate disparities.
- Incorporating a social factors framework into health policy simulation models has particular relevance for evaluating health interventions with equity implications.
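The mechanism behind the comparison in this abstract can be sketched with a toy three-state cohort model. All numbers here are invented, not the paper's: the "standard" run gives everyone one progression risk, while the social-factors variant assigns an assumed higher healthy-to-sick probability to an uninsured subgroup and averages over coverage.

```python
import numpy as np

def life_expectancy(p_sick, p_die_healthy=0.01, p_die_sick=0.05, cycles=60):
    """Undiscounted life-years for a cohort starting healthy."""
    P = np.array([[1 - p_sick - p_die_healthy, p_sick, p_die_healthy],
                  [0.0, 1 - p_die_sick, p_die_sick],
                  [0.0, 0.0, 1.0]])       # states: healthy, sick, dead
    x = np.array([1.0, 0.0, 0.0])
    years = 0.0
    for _ in range(cycles):
        x = x @ P
        years += x[0] + x[1]              # alive states accrue one year/cycle
    return years

# Standard model: one progression risk for everyone.
standard = life_expectancy(p_sick=0.04)
# Social-factors variant: an uninsured 10% of the cohort progresses faster
# (the 0.06 risk and 90/10 coverage split are illustrative assumptions).
with_insurance = 0.9 * life_expectancy(0.04) + 0.1 * life_expectancy(0.06)
print(round(standard, 2), round(with_insurance, 2))
```

Because the subgroup-averaged result is strictly lower, any treatment effect evaluated only in the standard model overstates gains for the group with lower coverage, which is the bias the social factors framework is designed to expose.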